An Integrated Approach for Arabic-English Named Entity Translation
نویسندگان
چکیده
Translation of named entities (NEs), such as person names, organization names and location names is crucial for cross lingual information retrieval, machine translation, and many other natural language processing applications. Newly named entities are introduced on daily basis in newswire and this greatly complicates the translation task. Also, while some names can be translated, others must be transliterated, and, still, others are mixed. In this paper we introduce an integrated approach for named entity translation deploying phrase-based translation, word-based translation, and transliteration modules into a single framework. While Arabic based, the approach introduced here is a unified approach that can be applied to NE translation for any language pair.
منابع مشابه
تشخیص اسامی اشخاص با استفاده از تزریق کلمههای نامزد اسم در میدانهای تصادفی شرطی برای زبان عربی
Named Entity Recognition and Extraction are very important tasks for discovering proper names including persons, locations, date, and time, inside electronic textual resources. Accurate named entity recognition system is an essential utility to resolve fundamental problems in question answering systems, summary extraction, information retrieval and extraction, machine translation, video interpr...
متن کاملThe Reality of Arabic Fiction Translation into English: A Sociological Approach
English translations of texts associated with Arabic fiction remain largely unexplored from a sociological perspective. Drawing on Pierre Bourdieu’s sociology, this paper aims to examine the genesis of Arabic fiction translation into English as a socially situated activity. Works of Arabic fiction emerged in English translation in the early twentieth century. Since then, this intellectual field...
متن کاملDudley North visits North London: Learning When to Transliterate to Arabic
We report the results of our work on automating the transliteration decision of named entities for English to Arabic machine translation. We construct a classification-based framework to automate this decision, evaluate our classifier both in the limited news and the diverse Wikipedia domains, and achieve promising accuracy. Moreover, we demonstrate a reduction of translation error and an impro...
متن کاملA Log-Linear Block Transliteration Model based on Bi-Stream HMMs
We propose a novel HMM-based framework to accurately transliterate unseen named entities. The framework leverages features in letteralignment and letter n-gram pairs learned from available bilingual dictionaries. Letter-classes, such as vowels/non-vowels, are integrated to further improve transliteration accuracy. The proposed transliteration system is applied to out-of-vocabulary named-entitie...
متن کاملSYNERGY: A Named Entity Recognition System for Resource-scarce Languages such as Swahili using Online Machine Translation
Developing Named Entity Recognition (NER) for a new language using standard techniques requires collecting and annotating large training resources, which is costly and time-consuming. Consequently, for many widely spoken languages such as Swahili, there are no freely available NER systems. We present here a new technique to perform NER for new languages using online machine translation systems....
متن کامل